Smart Paper Notes

A Flexible HLS Hoeffding Tree Implementation for Runtime Learning on FPGA

Luís Miguel Sousa , Nuno Paulino , João Canas Ferreira , João Bispo

Category: Machine Learning

2021-12-03

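Decision trees are often preferred when implementing machine learning in embedded systems because of their simplicity and scalability. Hoeffding trees are a type of decision tree that use the Hoeffding bound to learn patterns in data without continuously storing data samples for future reprocessing. This makes them especially suitable for deployment on embedded devices. In this work, we highlight the features of an HLS implementation of the Hoeffding tree. The implementation parameters include the feature size of the samples (D), the number of output classes (K), and the maximum number of nodes to which the tree is allowed to grow (Nd). We target a Xilinx MPSoC ZCU102 and evaluate: the design's resource requirements and clock frequency for different numbers of classes and feature sizes; the execution time on several synthetic datasets with varying sample sizes (N) and numbers of output classes; and the execution time and accuracy for two datasets from UCI. For a problem size of D3, K5, and N40000, a single decision tree running at 103 MHz performs inference 8.3x faster than a 1.2 GHz ARM Cortex-A53 core. Compared with a reference implementation of the Hoeffding tree, we achieve comparable classification accuracy on the UCI datasets.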
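To make the role of the Hoeffding bound concrete, the following is a minimal Python sketch of the standard split test used by Hoeffding trees. It is not the paper's HLS code. The function names, the use of information gain as the split metric, and the confidence parameter `delta` are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the observed mean of a
    variable with the given range is within this epsilon of its true mean
    after n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, num_classes, delta, n):
    """Split a leaf once the gap between the two best attributes exceeds the
    bound. Assumes information gain as the metric, whose range is
    log2(num_classes); both helper names are hypothetical."""
    value_range = math.log2(num_classes)
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

Because the bound shrinks as the sample count grows, a leaf only needs running statistics rather than the samples themselves: with K = 5 and `delta = 1e-7`, a gain gap of 0.2 is not enough to split after 100 samples but is after 40,000.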

translated by Google Translate

Multiple target tracking with interaction using an MCMC MRF Particle Filter

Helder F. S. Campos , Nuno Paulino

Category: Computer Vision

2021-11-25

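This paper presents and discusses an implementation of a multiple target tracking method that can handle target interactions and prevent tracker failures due to hijacking. The referenced approach uses a Markov chain Monte Carlo (MCMC) sampling step to evaluate the filter and constructs an efficient proposal density to generate new samples. This density integrates target interaction terms based on Markov random fields (MRFs) generated at each time step. The MRFs model the interactions between targets in order to reduce the tracking ambiguity that typical particle filters suffer from when tracking multiple targets. A test sequence of 662 grayscale frames containing 20 interacting ants in a confined space was used to test both the proposed approach and a set of importance-sampling-based independent particle filters, to establish a performance comparison. The results show that the implemented approach of modeling target interactions with MRFs successfully corrects many of the tracking errors made by the independent, interaction-unaware particle filters.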


Augmentation of base classifier performance via HMMs on a handwritten character data set

Hélder Campos , Nuno Paulino

Category: Computer Vision | Machine Learning

2021-11-17

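This paper presents a study of the performance of several base classifiers for recognizing handwritten characters of the modern Latin alphabet. Base classification performance is further enhanced with Viterbi error correction, which determines the Viterbi sequence. Hidden Markov models (HMMs) model the relationships between letters within a word to determine the most likely sequence of characters. Four base classifiers are studied, along with eight feature sets extracted from the handwritten dataset. The best classification performance after correction was 89.8%, and the average was 68.1%.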
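As an illustration of how a Viterbi sequence can correct per-character decisions, here is a minimal Python sketch of Viterbi decoding over an HMM. It is not the authors' implementation. It assumes that per-position classifier scores act as emission probabilities, that letter-to-letter transition probabilities are available, and that every probability is strictly positive; the function and argument names are hypothetical.

```python
import math


def viterbi(emission_probs, transition_probs, initial_probs):
    """Return the most likely state (letter) sequence.

    emission_probs: list of dicts, one per position, letter -> score
    transition_probs: dict of dicts, previous letter -> next letter -> prob
    initial_probs: dict, letter -> probability of starting a word
    All probabilities are assumed to be strictly positive.
    """
    states = list(initial_probs)
    # Log-score of the best path ending in each state at the first position.
    score = {s: math.log(initial_probs[s]) + math.log(emission_probs[0][s])
             for s in states}
    backpointers = []
    for emis in emission_probs[1:]:
        new_score, pointer = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            prev = max(states,
                       key=lambda p: score[p] + math.log(transition_probs[p][s]))
            new_score[s] = (score[prev] + math.log(transition_probs[prev][s])
                            + math.log(emis[s]))
            pointer[s] = prev
        backpointers.append(pointer)
        score = new_score
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


# Toy example with made-up numbers: the classifier slightly prefers "o" at
# the second position, but the transition q -> u is far more likely.
initial = {"q": 0.4, "u": 0.3, "o": 0.3}
transitions = {
    "q": {"q": 0.05, "u": 0.9, "o": 0.05},
    "u": {"q": 0.3, "u": 0.3, "o": 0.4},
    "o": {"q": 0.3, "u": 0.3, "o": 0.4},
}
emissions = [
    {"q": 0.8, "u": 0.1, "o": 0.1},
    {"q": 0.1, "u": 0.4, "o": 0.5},
]
decoded = viterbi(emissions, transitions, initial)  # ["q", "u"]
```

Taking the best letter at each position independently would give "qo" in this toy case; the transition model is what turns the second letter into "u".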


3DSGrasp: 3D Shape-Completion for Robotic Grasp

Seyed S. Mohammadi , Nuno F. Duarte , Dimitris Dimou , Yiming Wang , Matteo Taiana , Pietro Morerio , Atabak Dehban , Plinio Moreno , Alexandre Bernardino , Alessio Del Bue

Category: Robotics | Artificial Intelligence

2023-01-02

Real-world robotic grasping can be done robustly if complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and to point permutation, and it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.


Anxolotl, an Anxiety Companion App -- Stress Detection

Nuno Gomes , Matilde Pato , Pedro Santos , André Lourenço , Lourenço Rodrigues

Category: Machine Learning

2022-12-28

Stress has an effect on people's lives that cannot be overstated. While it can be good, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress monitoring solution that can be used in real life while tackling this challenge in a positive way. The SMILE data set was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning model for classification in Python, achieving a final accuracy of 64.1% and an F1-score of 54.96%. The resulting solution passed the robustness test, showing low variation between runs, which was a major point in favor of its possible future integration into the Anxolotl app.


MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification

Alina Petukhova , Nuno Fachada

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-22

This article presents a dataset of 10,917 news articles with hierarchical news categories, collected between January 1, 2019 and December 31, 2019. We manually labelled the articles based on a hierarchical taxonomy with 17 first-level and 109 second-level categories. The dataset can be used to train machine learning models that automatically classify news articles by topic. It can also be helpful for researchers working on news structuring, classification, and predicting future events based on released news.


Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation

Nuno M. Guerreiro , Pierre Colombo , Pablo Piantanida , André F. T. Martins

Category: Natural Language Processing | Machine Learning

2022-12-19

Neural machine translation (NMT) has become the de facto standard in real-world machine translation applications. However, NMT models can unpredictably produce severely pathological translations, known as hallucinations, that seriously undermine user trust. It thus becomes crucial to implement effective preventive strategies to guarantee their proper functioning. In this paper, we address the problem of hallucination detection in NMT by following a simple intuition: as hallucinations are detached from the source content, they exhibit encoder-decoder attention patterns that are statistically different from those of good-quality translations. We frame this problem with an optimal transport formulation and propose a fully unsupervised, plug-in detector that can be used with any attention-based NMT model. Experimental results show that our detector not only outperforms all previous model-based detectors, but is also competitive with detectors that employ large models trained on millions of samples.
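The intuition above can be sketched with a toy optimal transport score. This is not the paper's detector. It assumes that each translation is summarized by the attention mass each source token receives, that a reference distribution from good translations is available, and that token positions with unit spacing serve as the ground metric; the function name and all numbers are hypothetical.

```python
def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two discrete distributions defined on
    the same ordered positions 0..n-1 with unit spacing. In one dimension it
    equals the sum of absolute differences between the cumulative sums."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total, cumulative_gap = 0.0, 0.0
    for p_i, q_i in zip(p, q):
        cumulative_gap += p_i - q_i
        total += abs(cumulative_gap)
    return total


# Made-up attention mass over four source tokens.
reference = [0.25, 0.25, 0.25, 0.25]   # assumed typical of good translations
ordinary = [0.30, 0.20, 0.30, 0.20]    # close to the reference
collapsed = [0.00, 0.00, 0.00, 1.00]   # all mass on one token

ordinary_score = wasserstein_1d(ordinary, reference)
collapsed_score = wasserstein_1d(collapsed, reference)
```

Under these assumptions, a translation whose score is far above the scores seen for good translations would be flagged for inspection; choosing the threshold is left open here.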


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Robustness Evaluation of Regression Tasks with Skewed Domain Preferences

Nuno Costa , Nuno Moniz

Category: Machine Learning

2022-12-15

In natural phenomena, data distributions often deviate from normality. One can think of cataclysms as a self-explanatory example: events that occur almost never, and at the same time are many standard deviations away from the common outcome. In many scientific contexts it is exactly these tail events that researchers are most interested in anticipating, so that adequate measures can be taken to prevent or attenuate a major impact on society. Despite such efforts, we have yet to provide definite answers to crucial issues in evaluating predictive solutions in domains such as weather, pollution, and health. In this paper, we deal with two encapsulated problems simultaneously. The first is assessing the performance of regression models when non-uniform preferences apply: not all values are equally relevant concerning the accuracy of their prediction, and there is a particular interest in the most extreme values. The second is assessing the robustness of models when dealing with uncertainty regarding the actual underlying distribution of values relevant for such problems. We show how different levels of relevance associated with target values may impact experimental conclusions, and demonstrate the practical utility of the proposed methods.


DISCO: Adversarial Defense with Local Implicit Functions

Chih-Hui Ho , Nuno Vasconcelos

Category: Computer Vision

2022-12-11

The problem of adversarial defenses for image classification, where the goal is to robustify a classifier against adversarial examples, is considered. Inspired by the hypothesis that these examples lie beyond the natural image manifold, a novel aDversarIal defenSe with local impliCit functiOns (DISCO) is proposed to remove adversarial perturbations by localized manifold projections. DISCO consumes an adversarial image and a query pixel location and outputs a clean RGB value at the location. It is implemented with an encoder and a local implicit module, where the former produces per-pixel deep features and the latter uses the features in the neighborhood of the query pixel to predict the clean RGB value. Extensive experiments demonstrate that both DISCO and its cascade version outperform prior defenses, regardless of whether the defense is known to the attacker. DISCO is also shown to be data- and parameter-efficient and to mount defenses that transfer across datasets, classifiers and attacks.
